Steady-State Planning in Expected Reward Multichain MDPs
Abstract
The planning domain has experienced increased interest in the formal synthesis of decision-making policies. This typically entails finding a policy which satisfies specifications in the form of some well-defined logic. While many such logics have been proposed, with varying degrees of expressiveness and complexity in their capacity to capture desirable agent behavior, their value is limited when deriving policies which satisfy certain types of asymptotic behavior in general system models. In particular, we are interested in specifying constraints on the steady-state behavior of an agent, which captures the proportion of time the agent spends in each state as it interacts for an indefinite period of time with its environment. This is sometimes called the average or expected behavior of the agent, and the associated planning problem has faced significant challenges unless strong restrictions are imposed on the underlying model in terms of the connectivity of its graph structure. In this paper, we explore this steady-state planning problem, which consists of deriving a decision-making policy for an agent such that constraints on its steady-state behavior are satisfied. A linear programming solution for the general case of multichain Markov Decision Processes (MDPs) is proposed, and we prove that optimal solutions to the proposed programs yield stationary policies with rigorous guarantees of behavior.
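To make the notion of steady-state behavior concrete, the following is a minimal sketch in Python. It is not the paper's linear program. It only evaluates the quantity that such a program would constrain: the long-run fraction of time spent in each state under a fixed stationary policy. The two-state MDP, the policy, the helper names (`induced_chain`, `steady_state`) and the 40% threshold are all hypothetical and chosen for illustration.

```python
# Hypothetical 2-state, 2-action MDP (illustrative numbers only).
# P[s][a][t] is the probability of moving from state s to state t under action a.
P = [
    [[0.9, 0.1], [0.5, 0.5]],  # transitions from state 0 under actions 0 and 1
    [[0.3, 0.7], [1.0, 0.0]],  # transitions from state 1 under actions 0 and 1
]

# Hypothetical stationary randomized policy: policy[s][a] = Pr(action a | state s).
policy = [
    [0.5, 0.5],
    [1.0, 0.0],
]


def induced_chain(P, policy):
    """Return the Markov chain obtained by following `policy` in the MDP `P`."""
    n = len(P)
    return [
        [sum(policy[s][a] * P[s][a][t] for a in range(len(P[s]))) for t in range(n)]
        for s in range(n)
    ]


def steady_state(M, start, iters=5000):
    """Approximate the long-run fraction of time spent in each state.

    Averages the state distribution over `iters` steps (a Cesaro average),
    which is also well defined for periodic chains. In a multichain model
    the result generally depends on the initial distribution `start`.
    """
    n = len(M)
    dist = list(start)
    total = [0.0] * n
    for _ in range(iters):
        total = [acc + d for acc, d in zip(total, dist)]
        dist = [sum(dist[s] * M[s][t] for s in range(n)) for t in range(n)]
    return [acc / iters for acc in total]


M = induced_chain(P, policy)
pi = steady_state(M, start=[1.0, 0.0])

# Example steady-state specification (an assumption for this sketch):
# the agent should spend at least 40% of its time in state 1.
satisfies_spec = pi[1] >= 0.4
```

Here the induced chain has rows `[0.7, 0.3]` and `[0.3, 0.7]`, so the agent spends roughly half of its time in each state and the example specification holds. A planner of the kind the abstract describes would instead search over policies so that such constraints are met, rather than checking one given policy.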
Similar resources
Total Expected Discounted Reward MDPs: Existence of Optimal Policies
This article describes the results on the existence of optimal and nearly optimal policies for Markov Decision Processes (MDPs) with total expected discounted rewards. The problem of optimization of total expected discounted rewards for MDPs is also known under the name of discounted dynamic programming.
Exploiting separability in multiagent planning with continuous-state MDPs
Recent years have seen significant advances in techniques for optimally solving multiagent problems represented as decentralized partially observable Markov decision processes (Dec-POMDPs). A new method achieves scalability gains by converting Dec-POMDPs into continuous state MDPs. This method relies on the assumption of a centralized planning phase that generates a set of decentralized policie...
Steady state behavior and maintenance planning of bleaching system in a paper plant
This paper presents the steady state behavior and maintenance planning of the bleaching system in a paper plant. The paper plant comprises various systems including feeding, chipping, digesting, washing, bleaching, screening, stock preparation and paper making, etc. One of the most important functionaries of a paper plant, on which quality of paper depends, is the bleaching system, where rem...
Steady-state analysis of shortest expected delay routing
We consider a queueing system consisting of two non-identical exponential servers, where each server has its own dedicated queue and serves the customers in that queue FCFS. Customers arrive according to a Poisson process and join the queue promising the shortest expected delay, which is a natural and near-optimal policy for systems with non-identical servers. This system can be modeled as an i...
Robust Online Optimization of Reward-Uncertain MDPs
Imprecise-reward Markov decision processes (IRMDPs) are MDPs in which the reward function is only partially specified (e.g., by some elicitation process). Recent work using minimax regret to solve IRMDPs has shown, despite their theoretical intractability, how the set of policies that are nondominated w.r.t. reward uncertainty can be exploited to accelerate regret computation. However, the numb...
Journal
Journal title: Journal of Artificial Intelligence Research
Year: 2021
ISSN: 1076-9757, 1943-5037
DOI: https://doi.org/10.1613/jair.1.12611